Abstract: Security aspects of the Index Coding with Side
Information (ICSI) problem are investigated. Building on the
results of Bar-Yossef et al. (2006), the properties of linear coding
schemes for the ICSI problem are further explored. The notion
of weak security, considered by Bhattad and Narayanan (2005)
in the context of network coding, is generalized to block security.
It is shown that the coding scheme for the ICSI problem based
on a linear code $\mathcal{C}$ of length $n$, minimum distance $d$ and dual
distance $d^\perp$ is $(d-1-t)$-block secure (and hence also weakly
secure) if the adversary knows in advance $t \le d-2$ messages,
and is completely insecure if the adversary knows in advance
more than $n - d^\perp$ messages.

I. Introduction 

The problem of Index Coding with Side Information (ICSI) 
was introduced by Birk and Kol [1], [2]. It was motivated by
applications such as audio and video-on-demand, and daily 
newspaper delivery. In these applications a server (sender) 
has to deliver some sets of data, audio or video files to 
a set of clients (receivers), and different sets are requested by
different receivers. Assume that before the transmission starts,
the receivers already have (from previous transmissions) some
files or movies in their possession. Via a slow backward 
channel, the receivers can let the sender know which messages 
they already have in their possession, and which messages 
they request. By exploiting this information, the amount of 
the overall transmissions can be reduced. As it was observed 
in Q], this can be achieved by coding the messages at the 
server before broadcasting them out. 

Another possible application of the ICSI problem is in 
opportunistic wireless networks. These are the networks in 
which a wireless node can opportunistically listen to the 
wireless channel. As a result, the node may obtain packets 
that were not designated to it (see [3]–[5]). This way, a node
obtains some side information about the transmitted data. 
Exploiting this additional knowledge may help to increase the 
throughput of the system. 

Consider the toy example in Figure 1. It presents a scenario
with one sender and four receivers. Each receiver requires a
different information packet (or message). The naive approach
requires four separate transmissions, one transmission per
information packet. However, by exploiting the knowledge
about the subsets of messages that the clients already have, and
by coding the transmitted data, the server can satisfy
all the demands by broadcasting just one coded packet.
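The toy example can be sketched in a few lines of Python. This is a minimal illustration, assuming binary messages and assuming that the single coded packet is the XOR of all four messages; the concrete bit values and the helper name `decode` are hypothetical.

```python
from functools import reduce
from operator import xor

# Hypothetical message bits x_1..x_4 (index 0 is unused so indices match the text).
x = [None, 1, 0, 1, 1]

# Side information and demand of each receiver, as in Figure 1.
receivers = [
    {"has": [1, 3, 4], "demands": 2},
    {"has": [2, 3, 4], "demands": 1},
    {"has": [1, 2, 4], "demands": 3},
    {"has": [1, 2, 3], "demands": 4},
]

# The sender broadcasts one coded packet: the XOR of all four messages.
packet = reduce(xor, x[1:])

def decode(packet, known_bits):
    """Cancel the messages a receiver already has from the coded packet."""
    return reduce(xor, known_bits, packet)

recovered = [decode(packet, [x[i] for i in r["has"]]) for r in receivers]
```

Each receiver removes the three messages it already holds, and what remains is exactly the message it demands.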

Fig. 1. An example of the ICSI problem. One sender serves four receivers: the first has $x_1, x_3, x_4$ and demands $x_2$; the second has $x_2, x_3, x_4$ and demands $x_1$; the third has $x_1, x_2, x_4$ and demands $x_3$; the fourth has $x_1, x_2, x_3$ and demands $x_4$.

The ICSI problem has been a subject of several recent
studies [3], [6]–[12]. This problem can be regarded as a special
case of the well-known network coding (NC) problem [13]. In
particular, it was shown that every instance of the NC problem
can be reduced to an instance of the ICSI problem Q, fit)) . 

Several previous works focused on the design of an efficient
scheme for the ICSI problem. Bar-Yossef et al. [6] proved
that finding the best scalar linear binary solution for the ICSI
problem is equivalent to finding the so-called minrank of a
graph, which is known to be an NP-hard problem (see [6],
[14]). Here scalar linear solutions refer to linear schemes in
which each message is a symbol in the field $\mathbb{F}_q$. By contrast,
in vector linear solutions each message is a vector over $\mathbb{F}_q$.
Lubetzky and Stav [7] showed that there exist instances in
which scalar linear solutions over nonbinary fields and linear
solutions over mixed fields outperform the scalar linear binary
solutions. The latter were also studied by Bar-Yossef et al. [6].

El Rouayheb et al. [3], [10] and Alon et al. [12] showed
that for certain instances of the ICSI problem, vector linear
solutions achieve a strictly higher transmission rate than scalar
linear solutions do. They also pointed out that there exist
instances in which nonlinear codes outperform linear codes.
Several heuristic solutions for the ICSI problem were proposed 
in (9), pj. 

In this paper, we study the security aspect of a linear
solution for the ICSI problem. We show that every linear
scheme provides a certain level of security. More specifically,
let $n$ and $k$ be the length and the dimension of the code $\mathcal{C}$
associated with a particular ICSI instance. Let $d$ and $d^\perp$ be
its minimum distance and dual distance, respectively. We say
that a particular adversary is of strength $t$ if it has $t$ packets
of information in its possession. Then, we show that a scheme
employing the code $\mathcal{C}$ is $(d-1-t)$-block secure against all
adversaries of strength $t \le d-2$ and is completely insecure
against any adversary of strength at least $n - d^\perp + 1$. If the
code $\mathcal{C}$ is MDS, then the two bounds coincide.

The paper is organized as follows. Notations and definitions,
which are used in the rest of the paper, are introduced in
Section II. The model and some basic results for the ICSI
problem are presented in Section III. The security properties
of linear ICSI schemes are analyzed in Section IV. The main
results of this paper appear in that section. Finally, the paper
is concluded in Section V.

II. Preliminaries 

We use the notation $\mathbb{F}_q$ for the finite field of $q$ elements,
where $q$ is a power of a prime, and $\mathbb{F}_q^*$ for the set of all nonzero
elements of $\mathbb{F}_q$. We also use $[n]$ to denote the set of integers
$\{1, 2, \ldots, n\}$. For the vectors $\mathbf{u} = (u_1, u_2, \ldots, u_n) \in \mathbb{F}_q^n$ and
$\mathbf{v} = (v_1, v_2, \ldots, v_n) \in \mathbb{F}_q^n$, the (Hamming) distance between
$\mathbf{u}$ and $\mathbf{v}$ is defined to be the number of coordinates where $\mathbf{u}$
and $\mathbf{v}$ differ, namely,

$$d(\mathbf{u}, \mathbf{v}) = |\{i \in [n] : u_i \ne v_i\}| .$$

The support of a vector $\mathbf{u} \in \mathbb{F}_q^n$ is defined to be the set
$\mathrm{supp}(\mathbf{u}) = \{i \in [n] : u_i \ne 0\}$. The (Hamming) weight of
a vector $\mathbf{u}$, denoted $\mathrm{wt}(\mathbf{u})$, is defined to be $|\mathrm{supp}(\mathbf{u})|$, the
number of nonzero coordinates of $\mathbf{u}$.
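As a small illustration of these definitions, the following Python sketch computes the Hamming distance, support and weight. The function names and the two example vectors (taken over $\mathbb{F}_3$) are hypothetical choices made only for this illustration.

```python
def hamming_distance(u, v):
    """Number of coordinates in which u and v differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(1 for a, b in zip(u, v) if a != b)

def support(u):
    """Set of 1-based indices of the nonzero coordinates of u."""
    return {i for i, a in enumerate(u, start=1) if a != 0}

def weight(u):
    """Hamming weight: the size of the support."""
    return len(support(u))

# Two hypothetical vectors over F_3.
u = (1, 0, 2, 0, 1)
v = (1, 1, 2, 2, 0)

# The distance between u and v equals the weight of their difference.
difference = tuple((a - b) % 3 for a, b in zip(u, v))
```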

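A $k$-dimensional subspace $\mathcal{C}$ of $\mathbb{F}_q^n$ is called a linear
$[n, k, d]_q$ ($q$-ary) code if the minimum distance of $\mathcal{C}$,

$$d(\mathcal{C}) = \min_{\mathbf{u} \in \mathcal{C},\, \mathbf{v} \in \mathcal{C},\, \mathbf{u} \ne \mathbf{v}} d(\mathbf{u}, \mathbf{v}) ,$$

is equal to $d$. Sometimes we may use the notation $[n, k]_q$ for
the sake of simplicity. The vectors in $\mathcal{C}$ are called codewords. It
is easy to see that the minimum weight of a nonzero codeword
in a linear code $\mathcal{C}$ is equal to its minimum distance $d(\mathcal{C})$. A
generator matrix $G$ of an $[n, k]_q$-code $\mathcal{C}$ is a $k \times n$ matrix
whose rows are linearly independent codewords of $\mathcal{C}$. Then
$\mathcal{C} = \{\mathbf{y} G : \mathbf{y} \in \mathbb{F}_q^k\}$.

The dot product of two vectors $\mathbf{u}, \mathbf{v} \in \mathbb{F}_q^n$ is defined
to be $\mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^{n} u_i v_i \in \mathbb{F}_q$. Thus, $\mathbf{u} \cdot \mathbf{v} = \mathbf{u} \mathbf{v}^T$, the
normal matrix product of $\mathbf{u}$ and $\mathbf{v}^T$, where $\mathbf{v}^T$ denotes the
transpose of $\mathbf{v}$. The dual code or dual space of $\mathcal{C}$ is defined
as $\mathcal{C}^\perp = \{\mathbf{u} \in \mathbb{F}_q^n : \mathbf{u} \cdot \mathbf{c} = 0 \text{ for all } \mathbf{c} \in \mathcal{C}\}$. The minimum
distance of $\mathcal{C}^\perp$, $d(\mathcal{C}^\perp)$, is called the dual distance of $\mathcal{C}$.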
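For small binary codes, the minimum distance and the dual distance can be found by exhaustive enumeration. The sketch below does this for one generator matrix of a binary $[7, 4]$ Hamming code; the matrix and the function names are assumptions chosen for illustration, and the brute-force approach is only practical for short codes.

```python
from itertools import product

# Illustrative generator matrix of a binary [7, 4] Hamming code.
G = [
    (1, 0, 0, 0, 0, 1, 1),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 1, 1, 0),
    (0, 0, 0, 1, 1, 1, 1),
]

def dot(u, v):
    """Dot product over F_2."""
    return sum(a * b for a, b in zip(u, v)) % 2

def codewords(G):
    """All vectors yG with y ranging over F_2^k."""
    k, n = len(G), len(G[0])
    return {tuple(sum(y[i] * G[i][j] for i in range(k)) % 2 for j in range(n))
            for y in product((0, 1), repeat=k)}

def min_distance(code):
    """For a linear code: the minimum weight of a nonzero codeword."""
    return min(sum(c) for c in code if any(c))

C = codewords(G)
# The dual code: all vectors orthogonal to every row of G.
C_dual = {u for u in product((0, 1), repeat=len(G[0]))
          if all(dot(u, g) == 0 for g in G)}

d = min_distance(C)
d_dual = min_distance(C_dual)
```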

The following upper bound on the minimum distance of a
$q$-ary linear code is well-known (see [15, Chapter 1]).

Theorem 2.1 (Singleton bound): For an $[n, k, d]_q$-code, we
have $d \le n - k + 1$.

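Codes attaining this bound are called maximum distance
separable (MDS) codes. For a subset of vectors
$\{\mathbf{c}^{(1)}, \mathbf{c}^{(2)}, \ldots, \mathbf{c}^{(k)}\} \subseteq \mathbb{F}_q^n$
define its linear span:

$$\mathrm{span}(\{\mathbf{c}^{(1)}, \mathbf{c}^{(2)}, \ldots, \mathbf{c}^{(k)}\}) = \Big\{ \sum_{i=1}^{k} \alpha_i \mathbf{c}^{(i)} : \alpha_i \in \mathbb{F}_q,\ i \in [k] \Big\} .$$

We use $\mathbf{e}_i = (\underbrace{0, \ldots, 0}_{i-1}, 1, \underbrace{0, \ldots, 0}_{n-i}) \in \mathbb{F}_q^n$ to denote the
unit vector, which has a one at the $i$th position, and zeros
elsewhere. We also use $I_n$, $n \in \mathbb{N}$, to denote the $n \times n$ identity
matrix.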

We recall the following well-known result in coding theory. 

Theorem 2.2 ([16, p. 66]): Let $\mathcal{C}$ be an $[n, k, d]_q$-code
with dual distance $d^\perp$ and let $M$ denote the $q^k \times n$ matrix whose
$q^k$ rows are codewords of $\mathcal{C}$. If $r \le d^\perp - 1$ then each $r$-tuple
from $\mathbb{F}_q^r$ appears in an arbitrary set of $r$ columns of $M$ exactly
$q^{k-r}$ times.
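Theorem 2.2 can be checked numerically on a small code. The following sketch, assuming an illustrative binary $[7, 4]$ Hamming code with dual distance 4, verifies that every choice of $r \le 3$ columns of the codeword matrix $M$ contains each binary $r$-tuple exactly $2^{4-r}$ times. The helper name `is_balanced` is hypothetical.

```python
from collections import Counter
from itertools import combinations, product

# Illustrative generator matrix of a binary [7, 4] Hamming code.
G = [
    (1, 0, 0, 0, 0, 1, 1),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 1, 1, 0),
    (0, 0, 0, 1, 1, 1, 1),
]
k, n = len(G), len(G[0])
d_dual = 4  # dual distance of this code (its dual is the [7, 3, 4] simplex code)

# M: the 2^k x n matrix whose rows are all codewords.
M = [tuple(sum(y[i] * G[i][j] for i in range(k)) % 2 for j in range(n))
     for y in product((0, 1), repeat=k)]

def is_balanced(M, cols):
    """True if every r-tuple appears exactly 2^(k-r) times in the given columns."""
    r = len(cols)
    counts = Counter(tuple(row[j] for j in cols) for row in M)
    return len(counts) == 2 ** r and set(counts.values()) == {2 ** (k - r)}

theorem_holds = all(is_balanced(M, cols)
                    for r in range(1, d_dual)
                    for cols in combinations(range(n), r))

# Beyond the threshold r <= d_dual - 1 the property can fail.
holds_for_four_columns = all(is_balanced(M, cols)
                             for cols in combinations(range(n), d_dual))
```

The second check shows that the condition on $r$ matters: the four columns indexed by the support of a weight-4 dual codeword are linearly dependent, so not every 4-tuple can appear there.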

For a random vector $\mathbf{Y} = (Y_1, Y_2, \ldots, Y_n)$ and a subset
$B = \{i_1, i_2, \ldots, i_b\}$ of $[n]$, where $i_1 < i_2 < \cdots < i_b$, let $\mathbf{Y}_B$
denote the vector $(Y_{i_1}, Y_{i_2}, \ldots, Y_{i_b})$. For a $k \times n$ matrix $M$, let
$M_j$ denote the $j$th column of $M$, and $M[i]$ its $i$th row. For a
set $E \subseteq [n]$, let $M_E$ denote the $k \times |E|$ matrix obtained from
$M$ by deleting all the columns of $M$ which are not indexed
by the elements of $E$.

Let $X$ and $Y$ be discrete random variables taking values
in the sets $\Sigma_X$ and $\Sigma_Y$, respectively. Let $\Pr(X = x)$ denote
the probability that $X$ takes a particular value $x \in \Sigma_X$. The
(binary) entropy of $X$ is defined as

$$H_2(X) = - \sum_{x \in \Sigma_X} \Pr(X = x) \cdot \log_2 \Pr(X = x) .$$

The conditional entropy of $X$ given $Y$ is defined as

$$H_2(X|Y) = - \sum_{x \in \Sigma_X,\, y \in \Sigma_Y} \Pr(X = x, Y = y) \cdot \log_2 \Pr(X = x | Y = y) .$$

This definition can be naturally extended to

$$H_2(X | Y_1, Y_2, \ldots, Y_n) ,$$

for $n$ discrete random variables $Y_i$, $i \in [n]$.

If the probability distribution of $X$ is unchanged given the
knowledge of $Y$, i.e., $\Pr(X = x | Y = y) = \Pr(X = x)$ for all
$x \in \Sigma_X$, $y \in \Sigma_Y$, then $H_2(X|Y) = H_2(X)$. Indeed, $H_2(X|Y)$
equals

$$- \sum_{x \in \Sigma_X} \Big( \sum_{y \in \Sigma_Y} \Pr(X = x, Y = y) \Big) \cdot \log_2 \Pr(X = x)
= - \sum_{x \in \Sigma_X} \Pr(X = x) \cdot \log_2 \Pr(X = x)
= H_2(X) .$$
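The identity $H_2(X|Y) = H_2(X)$ for independent variables can be illustrated numerically. The sketch below computes both quantities from an explicit joint distribution; the two distributions and the function names are hypothetical examples, not taken from the paper.

```python
from math import isclose, log2

def entropy(p_x):
    """H_2(X) from a dictionary {x: Pr(X = x)}."""
    return -sum(p * log2(p) for p in p_x.values() if p > 0)

def conditional_entropy(p_xy):
    """H_2(X|Y) from a joint distribution {(x, y): Pr(X = x, Y = y)}."""
    p_y = {}
    for (_, y), p in p_xy.items():
        p_y[y] = p_y.get(y, 0.0) + p
    # Pr(X = x | Y = y) = Pr(X = x, Y = y) / Pr(Y = y)
    return -sum(p * log2(p / p_y[y]) for (_, y), p in p_xy.items() if p > 0)

# X is uniform on {0, 1, 2}; Y is a biased bit.
p_x = {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}
p_y = {0: 0.25, 1: 0.75}

# Independent case: knowing Y does not change the distribution of X.
joint_independent = {(x, y): px * py
                     for x, px in p_x.items() for y, py in p_y.items()}

# Dependent case: Y = X mod 2, so Y reveals something about X.
joint_dependent = {(0, 0): 1 / 3, (1, 1): 1 / 3, (2, 0): 1 / 3}
```

In the dependent case the conditional entropy drops to $2/3$ of a bit, below $H_2(X) = \log_2 3$.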
III. Index Coding and Some Basic Results
A. Linear Coding Model

The index coding problem considers the following communications
scenario. There is a unique sender (or source) $S$,
who has a vector of messages $\mathbf{x} = (x_1, x_2, \ldots, x_n) \in \mathbb{F}_q^n$
in his possession, which is a realized value of a random
vector $\mathbf{X} = (X_1, X_2, \ldots, X_n)$. Hereafter, $X_1, X_2, \ldots, X_n$
are assumed to be independent uniformly distributed random
variables over $\mathbb{F}_q$. There are also $m$ receivers $R_1, R_2, \ldots, R_m$.
For each $j \in [m]$, $R_j$ has some side information, i.e. $R_j$ owns
a subset of messages $\{x_i\}_{i \in \mathcal{X}_j}$, $\mathcal{X}_j \subseteq [n]$. In addition, each
$R_j$, $j \in [m]$, is interested in receiving the message $x_{f(j)}$,
for some demand function $f : [m] \to [n]$. Hereafter, we
assume that every receiver requests exactly one message. The
scenario where each receiver requests more than one message
is discussed in Section III-B.

Let $\mathcal{X}_0 \subseteq [n]$ and $\mathbf{u} \in \mathbb{F}_q^n$. In the sequel, we write $\mathbf{u} \lhd \mathcal{X}_0$
if for any $u_i \ne 0$ it holds $i \in \mathcal{X}_0$. Intuitively, this means that
if some receiver knows $x_i$ for all $i \in \mathcal{X}_0$ (and also knows $\mathbf{u}$),
then this receiver is also able to compute the value of $\mathbf{u} \cdot \mathbf{x}$.

In this paper we consider linear index coding. In particular,
we assume that $S$ broadcasts a vector of $k \in \mathbb{N}$ linear
combinations $\mathbf{s} = (s_1, s_2, \ldots, s_k) \in \mathbb{F}_q^k$, where each combination
is of the form

$$s_j = c^{(j)}_1 x_1 + c^{(j)}_2 x_2 + \cdots + c^{(j)}_n x_n ,$$

for $j \in [k]$, where $\{\mathbf{c}^{(j)} = (c^{(j)}_1, c^{(j)}_2, \ldots, c^{(j)}_n)\}_{j \in [k]}$ is a
linearly independent set of vectors in $\mathbb{F}_q^n$. Let the code $\mathcal{C}$ of
length $n$ and dimension $k$ over $\mathbb{F}_q$ be defined as

$$\mathcal{C} = \mathrm{span}(\{\mathbf{c}^{(1)}, \mathbf{c}^{(2)}, \ldots, \mathbf{c}^{(k)}\}) .$$

Hereafter, we assume that the sets $\mathcal{X}_j$ for $j \in [m]$ are known
to $S$. Moreover, we also assume that the code $\mathcal{C}$ is known to
each receiver $R_j$, $j \in [m]$. In practice this can be achieved by
a preliminary communication session, in which the knowledge of
the sets $\mathcal{X}_j$ for $j \in [m]$ and of the code $\mathcal{C}$ is disseminated
between the participants of the scheme.

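The following lemma was formulated in [6] for the case
where $\mathbb{F}_q$ is a binary field. This lemma specifies a sufficient
condition on $\mathcal{C}$ so that the coding scheme is successful, i.e.
any $R_j$ has enough data to reconstruct $x_{f(j)}$, $j \in [m]$, at the
end of the communication session. We reproduce this lemma
(for the general $\mathbb{F}_q$) with its proof for the sake of completeness
of the presentation.

Lemma 3.1: Let $\mathcal{C}$ be an $[n, k]_q$-code and let
$\{\mathbf{c}^{(1)}, \mathbf{c}^{(2)}, \ldots, \mathbf{c}^{(k)}\}$ be a basis of $\mathcal{C}$. Suppose $S$ broadcasts the
vector $\mathbf{s} = (s_1, s_2, \ldots, s_k) = (\mathbf{c}^{(1)} \cdot \mathbf{x}, \mathbf{c}^{(2)} \cdot \mathbf{x}, \ldots, \mathbf{c}^{(k)} \cdot \mathbf{x})$.
Then, for each $j \in [m]$, the receiver $R_j$ can reconstruct $x_{f(j)}$
if the following two conditions hold:

1) there exists $\mathbf{u} \in \mathbb{F}_q^n$ such that $\mathbf{u} \lhd \mathcal{X}_j$;

2) the vector $\mathbf{u} + \mathbf{e}_{f(j)}$ is in $\mathcal{C}$.

Proof: Assume that $\mathbf{u} \lhd \mathcal{X}_j$ and $\mathbf{u} + \mathbf{e}_{f(j)} \in \mathcal{C}$. Since
$\mathbf{u} + \mathbf{e}_{f(j)} \in \mathcal{C}$, we obtain that there exist $\beta_1, \beta_2, \ldots, \beta_k \in \mathbb{F}_q$
such that

$$(\mathbf{u} + \mathbf{e}_{f(j)}) + \sum_{i=1}^{k} \beta_i \mathbf{c}^{(i)} = \mathbf{0} .$$

By multiplying by $\mathbf{x}$, we obtain that

$$(\mathbf{u} + \mathbf{e}_{f(j)}) \cdot \mathbf{x} + \sum_{i=1}^{k} \beta_i (\mathbf{c}^{(i)} \cdot \mathbf{x})
= (\mathbf{u} + \mathbf{e}_{f(j)}) \cdot \mathbf{x} + \sum_{i=1}^{k} \beta_i s_i = 0 . \qquad (1)$$

Therefore,

$$x_{f(j)} = \mathbf{e}_{f(j)} \cdot \mathbf{x} = - \mathbf{u} \cdot \mathbf{x} - \sum_{i=1}^{k} \beta_i s_i .$$

Observe that $R_j$ is able to find $\mathbf{u}$ and all $\beta_i$ from the
knowledge of the code $\mathcal{C}$. Moreover, $R_j$ is also able to compute
$\mathbf{u} \cdot \mathbf{x}$ since $\mathbf{u} \lhd \mathcal{X}_j$. Therefore, $R_j$ is able to compute $x_{f(j)}$. ■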



Lemma 3.1 suggests that in order for the receivers to recover
their desired symbols, $S$ can use the code $\mathcal{C} = \mathrm{span}(\{\mathbf{v}^{(j)} +
\mathbf{e}_{f(j)}\}_{j \in [m]})$, for some $\mathbf{v}^{(j)} \lhd \mathcal{X}_j$, $j \in [m]$. We show later
in Corollary 4.3 that $S$ must use a code of such form to
guarantee a successful communication session. Finding the
lowest dimension code by careful selection of the $\mathbf{v}^{(j)}$'s is a
difficult task (in fact it is NP-hard to do so, see [6], [14]),
which, however, yields a scheme with the minimal number of
transmissions.
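The decoding procedure behind Lemma 3.1 can be sketched as follows. This is a minimal brute-force illustration over $\mathbb{F}_2$, assuming a hypothetical instance with three messages in which $R_1$ has $x_2$ and wants $x_1$, $R_2$ has $x_3$ and wants $x_2$, and $R_3$ has $x_1$ and wants $x_3$; the function name `decode` and the instance are not from the paper. The receiver searches for a codeword of the form $\mathbf{u} + \mathbf{e}_{f(j)}$ with $\mathbf{u} \lhd \mathcal{X}_j$ and then cancels $\mathbf{u} \cdot \mathbf{x}$.

```python
from itertools import product

q = 2
G = [(1, 1, 0),   # c^(1) = e_1 + e_2
     (0, 1, 1)]   # c^(2) = e_2 + e_3
k, n = len(G), len(G[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v)) % q

def decode(s, side_info, demand):
    """Recover x_{f(j)} from the broadcast s.

    side_info: dict {index: value} with 1-based indices (the set X_j);
    demand: the 1-based index f(j).
    Returns None if no codeword u + e_{f(j)} with u supported on X_j exists.
    """
    for beta in product(range(q), repeat=k):
        c = tuple(sum(beta[i] * G[i][l] for i in range(k)) % q for l in range(n))
        if c[demand - 1] != 1:
            continue
        others = [l for l in range(n) if l != demand - 1 and c[l] != 0]
        if all((l + 1) in side_info for l in others):
            c_dot_x = sum(b * si for b, si in zip(beta, s)) % q      # c . x
            u_dot_x = sum(c[l] * side_info[l + 1] for l in others) % q  # u . x
            return (c_dot_x - u_dot_x) % q
    return None

# (side-information indices, demanded index) for R_1, R_2, R_3.
receivers = [({2}, 1), ({3}, 2), ({1}, 3)]

def all_receivers_succeed(x):
    s = tuple(dot(g, x) for g in G)
    return all(decode(s, {i: x[i - 1] for i in has}, want) == x[want - 1]
               for has, want in receivers)

success = all(all_receivers_succeed(x) for x in product(range(q), repeat=n))
```

Two transmissions serve three receivers here, because $\mathbf{e}_1 + \mathbf{e}_3$ is the sum of the two basis vectors.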

B. Receivers with Multiple Requests 

Consider a more general ICSI problem where each receiver
requests more than one message. This problem was discussed 
in QJ. It was shown therein that there exists an equivalent 
problem with one requested message per each receiver. This 
new problem is easily obtained by splitting each receiver, 
which requests p > 1 messages, into p different receivers 
with the same side information, where each receiver requests 
exactly one message. For more detail, the reader can refer 
to [1]. In the sequel, we consider scenarios where each
receiver requests exactly one message. 

C. Scalar and Vector Solutions 

The type of linear solutions considered in this model are
referred to as scalar linear solutions in [10], [12]. For vector
linear solutions, each message is divided into several packets,
each packet is a symbol in $\mathbb{F}_q$, and a coding scheme combines
packets from different messages to minimize the number of
transmissions. It was shown in [12] (see also [3], [10]) that
there exist instances of the problem in which a vector linear
solution has significantly higher transmission rate than any
scalar linear solution. Here, the transmission rate of a scheme
is defined as the number of packet transmissions required for
delivery of one packet to each receiver.

However, if each message consists of $p$ packets (symbols
in $\mathbb{F}_q$), then a vector linear solution of this instance can
be regarded as a scalar linear solution (over $\mathbb{F}_q$) of another
instance of the index coding problem, where each receiver
requests exactly $p$ messages in $\mathbb{F}_q$. This instance, in turn, is
equivalent to an instance of the ICSI problem considered in
this paper. Therefore, two identical $q$-ary linear codes of
length $n$ can be used for these two equivalent ICSI problems.
In that case, in order to study the security of the vector linear
solution of an instance of the ICSI problem, it is enough to
study the security of the equivalent scalar linear solution of
some other instance of the ICSI problem.

IV. Block Secure Linear Index Coding 

A. Block Security and Weak Security 



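In Section IV we present our main results. Hereafter,
we assume the presence of an adversary $A$ who can listen
to all transmissions. Let $\mathcal{C}$ be an $[n, k]_q$-code, and let
$\{\mathbf{c}^{(1)}, \mathbf{c}^{(2)}, \ldots, \mathbf{c}^{(k)}\}$ be a basis of $\mathcal{C}$. Let $G$ be a generator
matrix of $\mathcal{C}$ whose rows are $\mathbf{c}^{(1)}, \mathbf{c}^{(2)}, \ldots, \mathbf{c}^{(k)}$. Suppose $S$
broadcasts $\mathbf{s} = (s_1, s_2, \ldots, s_k) = (\mathbf{c}^{(1)} \cdot \mathbf{x}, \mathbf{c}^{(2)} \cdot \mathbf{x}, \ldots, \mathbf{c}^{(k)} \cdot \mathbf{x})$.
The adversary is assumed to possess side information
$\{x_i\}_{i \in \mathcal{X}_A}$, where $\mathcal{X}_A \subseteq [n]$. For short, we say that $A$ knows
$\mathbf{x}_{\mathcal{X}_A}$. The strength of an adversary is defined to be $|\mathcal{X}_A|$.
Denote $\widehat{\mathcal{X}}_A = [n] \setminus \mathcal{X}_A \ne \emptyset$. Note that from listening to
$S$, the adversary also knows $\mathbf{s}^T = G\mathbf{x}^T$. We define below
several levels of security for ICSI schemes.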

Definition 4.1: Consider an ICSI scheme, which is based on
a linear code $\mathcal{C}$. The sender $S$ possesses a vector of messages
$\mathbf{x} \in \mathbb{F}_q^n$, which is a realized value of a random vector $\mathbf{X}$. An
adversary $A$ possesses $\{x_i\}_{i \in \mathcal{X}_A}$.

1) For $B \subseteq \widehat{\mathcal{X}}_A$, the adversary is said to have no information
about $\mathbf{x}_B$ if

$$H_2(\mathbf{X}_B | G\mathbf{X}^T, \mathbf{X}_{\mathcal{X}_A}) = H_2(\mathbf{X}_B) . \qquad (2)$$

In other words, despite the partial knowledge on $\mathbf{x}$ that the
adversary has (his side information and the symbols he
overheard), the symbols $\mathbf{x}_B$ still look completely random
to him.

2) The scheme is said to be $b$-block secure against $\mathcal{X}_A$ if for
every $b$-subset $B \subseteq \widehat{\mathcal{X}}_A$, the adversary has no information
about $\mathbf{x}_B$.

3) The scheme is said to be $b$-block secure against all
adversaries of strength $t$ ($0 \le t \le n-1$) if it is $b$-block
secure against $\mathcal{X}_A$ for every $\mathcal{X}_A \subseteq [n]$, $|\mathcal{X}_A| = t$.

4) The scheme is said to be weakly secure against $\mathcal{X}_A$ if it is
1-block secure against $\mathcal{X}_A$. In other words, after listening
to all transmissions, the adversary has no information
about each particular message that he does not possess
in the first place.

5) The scheme is said to be weakly secure against all
adversaries of strength $t$ ($0 \le t \le n-1$) if it is weakly
secure against $\mathcal{X}_A$ for every $t$-subset $\mathcal{X}_A$ of $[n]$.

6) The scheme is said to be completely insecure against $\mathcal{X}_A$
if an adversary, who possesses $\{x_i\}_{i \in \mathcal{X}_A}$, by listening to
all transmissions, is able to determine $x_i$ for all $i \in \widehat{\mathcal{X}}_A$.

7) The scheme is said to be completely insecure against any
adversary of strength $t$ ($0 \le t \le n-1$) if an adversary,
who possesses an arbitrary set of $t$ messages, is always
able to reconstruct all of the other $n - t$ messages after
listening to all transmissions.

Even when the scheme is $b$-block secure ($b \ge 1$) as defined
above, the adversary is still able to obtain information about
dependencies between various $x_i$'s, $i \in \widehat{\mathcal{X}}_A$ (but he gains no
information about any group of $b$ particular messages). This
definition of $b$-block security is a generalization of that of
weak security (see [17], [18]). Obviously, if a scheme is
$b$-block secure against $\mathcal{X}_A$ ($b \ge 1$) then it is also weakly secure
against $\mathcal{X}_A$, but the converse is not always true.

B. Necessary and Sufficient Conditions for Block Security 

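In the sequel, we consider the sets $B \subseteq [n]$, $B \ne \emptyset$, and
$E \subseteq [n]$, $E \ne \emptyset$. Moreover, we assume that the sets $\mathcal{X}_A$,
$B$, and $E$ are disjoint, and that they form a partition of $[n]$,
namely $\mathcal{X}_A \cup B \cup E = [n]$. In particular, $\widehat{\mathcal{X}}_A = B \cup E$.

Lemma 4.1: Assume that for all $\mathbf{u} \lhd \mathcal{X}_A$ and for all $\alpha_i \in \mathbb{F}_q$,
$i \in B$ (not all $\alpha_i$'s are zeros),

$$\mathbf{u} + \sum_{i \in B} \alpha_i \mathbf{e}_i \notin \mathcal{C} . \qquad (3)$$

Then,

1) for all $i \in B$:

$$G_i \in \mathrm{span}(\{G_j\}_{j \in E}) ; \qquad (4)$$

2) the system

$$G_E \mathbf{y}^T = G_B \mathbf{w}^T \qquad (5)$$

has at least one solution $\mathbf{y} \in \mathbb{F}_q^{|E|}$ for every choice of
$\mathbf{w} \in \mathbb{F}_q^{|B|}$.

Proof:

1) If $\mathrm{rank}(G_E) = k$ then the first claim follows immediately.
Otherwise, assume that $\mathrm{rank}(G_E) < k$. As
the $k$ rows of $G_E$ are linearly dependent, there exists
$\mathbf{y} \in \mathbb{F}_q^k \setminus \{\mathbf{0}\}$ such that $\mathbf{y} G_E = \mathbf{0}$.

• If for all such $\mathbf{y}$ and for all $i \in B$ we have
$\mathbf{y} G_i = 0$, then $G_i \in ((\mathrm{span}(\{G_j\}_{j \in E}))^\perp)^\perp =
\mathrm{span}(\{G_j\}_{j \in E})$ for all $i \in B$.

• Otherwise, there exist $\mathbf{y} \in \mathbb{F}_q^k$ and $i \in B$ such that
$\mathbf{y} G_E = \mathbf{0}$ and $\mathbf{y} G_i \ne 0$. Without loss of generality,
assume that $G = (G_{\mathcal{X}_A} | G_B | G_E)$. Let $\mathbf{c} = \mathbf{y} G \in \mathcal{C}$.
Then

$$\mathbf{c} = (\mathbf{c}_{\mathcal{X}_A} | \mathbf{c}_B | \mathbf{c}_E) = (\mathbf{y} G_{\mathcal{X}_A} | \mathbf{y} G_B | \mathbf{y} G_E) .$$

Hence $\mathbf{c}_B = \mathbf{y} G_B \ne \mathbf{0}$ and $\mathbf{c}_E = \mathbf{y} G_E = \mathbf{0}$. Let
$\mathbf{u} = (\mathbf{c}_{\mathcal{X}_A} | \mathbf{0} | \mathbf{0}) \lhd \mathcal{X}_A$ and $\alpha_i = c_i$ for all $i \in B$. Then
the $\alpha_i$'s are not all zero and $\mathbf{u} + \sum_{i \in B} \alpha_i \mathbf{e}_i = \mathbf{c} \in \mathcal{C}$,
which contradicts (3).

2) By (4), each column of $G_B$ is a linear combination
of columns of $G_E$. Hence $G_B \mathbf{w}^T$ is also a linear
combination of columns of $G_E$. Therefore, (5) has at
least one solution.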
The following lemma provides us with a criterion to decide
whether a particular scheme (based on a code $\mathcal{C}$) is block
secure against the adversary $A$ or not. This lemma is a
generalization of Lemma 3.1 in the following senses. First,
while Lemma 3.1 does not discuss security, observe that the
adversary can be viewed as one of the receivers. Then, the
sufficient conditions 1) and 2) in Lemma 3.1 (when applied
to $A$ and $\mathcal{X}_A$) are also sufficient conditions for successful
reconstruction of a symbol by $A$. Below, we show that these
conditions are also necessary. Additionally, in the lemma
below, the weak security, implied by the aforementioned
generalization, is further extended to block security. (Note that
a similar statement can be formulated with respect to the receivers
$R_j$.)

Lemma 4.2: For a subset $B \subseteq \widehat{\mathcal{X}}_A$, the adversary, after
listening to all transmissions, has no information about $\mathbf{x}_B$
if and only if

$$\forall \mathbf{u} \lhd \mathcal{X}_A,\ \forall \alpha_i \in \mathbb{F}_q \text{ with } \alpha_i,\ i \in B, \text{ not all zero}: \quad
\mathbf{u} + \sum_{i \in B} \alpha_i \mathbf{e}_i \notin \mathcal{C} . \qquad (6)$$

In particular, for each $i \notin \mathcal{X}_A$, $A$ has no information about $x_i$
if and only if

$$\forall \mathbf{u} \lhd \mathcal{X}_A : \quad \mathbf{u} + \mathbf{e}_i \notin \mathcal{C} .$$

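Proof: Assume that (6) holds. We need to show that the
entropy of $\mathbf{X}_B$ is not changed given the knowledge of $G\mathbf{X}^T$
and $\mathbf{X}_{\mathcal{X}_A}$. Hence, as shown in Section II, it suffices to show
that for all $\mathbf{g} \in \mathbb{F}_q^{|B|}$:

$$\Pr(\mathbf{X}_B = \mathbf{g} \,|\, G\mathbf{X}^T = \mathbf{s}^T, \mathbf{X}_{\mathcal{X}_A} = \mathbf{x}_{\mathcal{X}_A}) = \frac{1}{q^{|B|}} . \qquad (7)$$

Consider the following linear system with the unknown $\mathbf{z} \in \mathbb{F}_q^n$

$$\begin{cases} \mathbf{z}_B = \mathbf{g} \\ \mathbf{z}_{\mathcal{X}_A} = \mathbf{x}_{\mathcal{X}_A} \\ G\mathbf{z}^T = \mathbf{s}^T , \end{cases}$$

which is equivalent to

$$\begin{cases} \mathbf{z}_B = \mathbf{g} \\ \mathbf{z}_{\mathcal{X}_A} = \mathbf{x}_{\mathcal{X}_A} \\ G_E \mathbf{z}_E^T = \mathbf{s}^T - G_B \mathbf{g}^T - G_{\mathcal{X}_A} \mathbf{x}_{\mathcal{X}_A}^T . \end{cases} \qquad (8)$$

In order to prove that (7) holds, it suffices to show that for
all choices of $\mathbf{g} \in \mathbb{F}_q^{|B|}$, (8) always has the same number of
solutions $\mathbf{z}$. Notice that the number of solutions $\mathbf{z}$ of (8) is
equal to the number of solutions $\mathbf{z}_E$ of

$$G_E \mathbf{z}_E^T = \mathbf{s}^T - G_B \mathbf{g}^T - G_{\mathcal{X}_A} \mathbf{x}_{\mathcal{X}_A}^T , \qquad (9)$$

where $\mathbf{s}$, $\mathbf{g}$, and $\mathbf{x}_{\mathcal{X}_A}$ are known. For any $\mathbf{g} \in \mathbb{F}_q^{|B|}$, if (9) has a
solution, then it has exactly $q^{|E| - \mathrm{rank}(G_E)}$ different solutions.
Therefore, it suffices to prove that (9) has at least one solution
for every $\mathbf{g} \in \mathbb{F}_q^{|B|}$.

Since $\mathbf{x}$ itself satisfies $G\mathbf{x}^T = \mathbf{s}^T$, we have

$$G_E \mathbf{x}_E^T = \mathbf{s}^T - G_B \mathbf{x}_B^T - G_{\mathcal{X}_A} \mathbf{x}_{\mathcal{X}_A}^T . \qquad (10)$$

Subtracting (10) from (9), we obtain

$$G_E (\mathbf{z}_E^T - \mathbf{x}_E^T) = G_B (\mathbf{x}_B^T - \mathbf{g}^T) ,$$

which can be rewritten as

$$G_E \mathbf{y}^T = G_B \mathbf{w}^T , \qquad (11)$$

where $\mathbf{y} = \mathbf{z}_E - \mathbf{x}_E$, $\mathbf{w} = \mathbf{x}_B - \mathbf{g}$. Due to Lemma 4.1, (11)
always has a solution $\mathbf{y}$, for every choice of $\mathbf{w}$. Therefore (9)
has at least one solution for every $\mathbf{g} \in \mathbb{F}_q^{|B|}$.

Now we prove the converse. Assume that (6) does not hold.
Then there exist $\mathbf{u} \lhd \mathcal{X}_A$ and $\alpha_i \in \mathbb{F}_q$, $i \in B$, where the $\alpha_i$'s,
$i \in B$, are not all zero, such that

$$\mathbf{u} + \sum_{i \in B} \alpha_i \mathbf{e}_i = \mathbf{c}$$

for some $\mathbf{c} \in \mathcal{C}$. Hence, similar to the proof of Lemma 3.1,
the adversary obtains

$$\sum_{i \in B} \alpha_i x_i = \Big( \sum_{i \in B} \alpha_i \mathbf{e}_i \Big) \cdot \mathbf{x}
= (\mathbf{c} - \mathbf{u}) \cdot \mathbf{x}
= \mathbf{c} \cdot \mathbf{x} - \mathbf{u} \cdot \mathbf{x} .$$

Note that the adversary can calculate $\mathbf{c} \cdot \mathbf{x}$ from $\mathbf{s}$, and can
also find $\mathbf{u} \cdot \mathbf{x}$ based on his own side information. Therefore,
$A$ is able to compute a nontrivial linear combination of the $x_i$'s,
$i \in B$. Hence the entropy $H_2(\mathbf{X}_B | G\mathbf{X}^T, \mathbf{X}_{\mathcal{X}_A}) < H_2(\mathbf{X}_B)$.
Thus, the adversary has some information about $\mathbf{x}_B$. ■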
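Lemma 4.2 can be cross-checked by brute force on a very small code. The sketch below, assuming a hypothetical binary $[4, 2]$ code, compares the algebraic condition (6) with a direct enumeration of all message vectors consistent with what the adversary observes. Since the messages are uniform, the adversary has no information about $\mathbf{x}_B$ exactly when every value of $\mathbf{x}_B$ occurs equally often among the consistent vectors. Indices are 0-based in the code, and all names are illustrative.

```python
from itertools import combinations, product

# Hypothetical binary [4, 2] code used only for this check.
G = [(1, 0, 1, 1), (0, 1, 0, 1)]
k, n = len(G), len(G[0])
C = [tuple(sum(y[i] * G[i][j] for i in range(k)) % 2 for j in range(n))
     for y in product((0, 1), repeat=k)]

def broadcast(z):
    """The transmitted vector G z^T over F_2."""
    return tuple(sum(g[j] * z[j] for j in range(n)) % 2 for g in G)

def criterion(XA, B):
    """Condition (6): no codeword supported on XA and B that is nonzero on B."""
    allowed = set(XA) | set(B)
    return not any(all(c[j] == 0 for j in range(n) if j not in allowed)
                   and any(c[j] for j in B) for c in C)

def no_information(XA, B):
    """For every x, the values of z_B are uniform over all consistent z."""
    for x in product((0, 1), repeat=n):
        s = broadcast(x)
        counts = {}
        for z in product((0, 1), repeat=n):
            if broadcast(z) == s and all(z[j] == x[j] for j in XA):
                key = tuple(z[j] for j in B)
                counts[key] = counts.get(key, 0) + 1
        if len(counts) != 2 ** len(B) or len(set(counts.values())) != 1:
            return False
    return True

agree = all(criterion(XA, B) == no_information(XA, B)
            for t in range(n)
            for XA in combinations(range(n), t)
            for b in range(1, n - t + 1)
            for B in combinations([j for j in range(n) if j not in XA], b))
```

The enumeration is exponential in $n$, so this is a sanity check for toy instances, while condition (6) is the practical test.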

We have the following straightforward corollary. It generalizes
Lemma 3.1 by providing both the necessary and sufficient
conditions for the weak security. (Note that this corollary
considers the receiver $R_j$ rather than the adversary $A$, since
the arguments of Lemma 4.2 also apply to all receivers.)

Corollary 4.3: For each $j \in [m]$, the receiver $R_j$ can
reconstruct $x_{f(j)}$, $f(j) \notin \mathcal{X}_j$, if and only if

1) there exists $\mathbf{u} \in \mathbb{F}_q^n$ such that $\mathbf{u} \lhd \mathcal{X}_j$;

2) the vector $\mathbf{u} + \mathbf{e}_{f(j)}$ is in $\mathcal{C}$.

Corollary 4.3 suggests that in order for the receivers to
recover their desired messages, it is necessary and sufficient to
employ a code $\mathcal{C}$ of the form $\mathcal{C} = \mathrm{span}(\{\mathbf{v}^{(j)} + \mathbf{e}_{f(j)}\}_{j \in [m]})$,
for some $\mathbf{v}^{(j)} \lhd \mathcal{X}_j$, $j \in [m]$.

Theorem 4.4: Suppose that the source $S$ broadcasts

$$\mathbf{s} = (\mathbf{c}^{(1)} \cdot \mathbf{x}, \mathbf{c}^{(2)} \cdot \mathbf{x}, \ldots, \mathbf{c}^{(k)} \cdot \mathbf{x}) ,$$

where $\{\mathbf{c}^{(1)}, \mathbf{c}^{(2)}, \ldots, \mathbf{c}^{(k)}\}$ is a basis of $\mathcal{C} = \mathrm{span}(\{\mathbf{v}^{(j)} +
\mathbf{e}_{f(j)}\}_{j \in [m]})$, for some $\mathbf{v}^{(j)} \lhd \mathcal{X}_j$, $j \in [m]$. Let $d$ be the
minimum distance of $\mathcal{C}$. Then

1) The scheme is $(d-1-t)$-block secure against all adversaries
of strength $t \le d-2$. In particular, the scheme is
weakly secure against all adversaries of strength $t = d-2$.

2) The scheme is not weakly secure against at least one
adversary of strength $t = d-1$. More generally, if there
exists a codeword of weight $w$, then the scheme is not
weakly secure against at least one adversary of strength
$t = w-1$.

3) Every adversary of strength $t \le d-1$ is able to determine
a list of $q^{n-t-k}$ vectors in $\mathbb{F}_q^n$ which includes the vector
of messages $\mathbf{x}$.

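Proof:

1) Observe that by Corollary 4.3 every $R_j$, $j \in [m]$, can
reconstruct $x_{f(j)}$, $f(j) \notin \mathcal{X}_j$. Assume that $t \le d-2$. By
Lemma 4.2 it suffices to show that for every $t$-subset $\mathcal{X}_A$
of $[n]$ and for every $(d-1-t)$-subset $B$ of $\widehat{\mathcal{X}}_A$,

$$\forall \mathbf{u} \lhd \mathcal{X}_A,\ \forall \alpha_i \in \mathbb{F}_q \text{ with } \alpha_i,\ i \in B, \text{ not all zero}: \quad
\mathbf{u} + \sum_{i \in B} \alpha_i \mathbf{e}_i \notin \mathcal{C} . \qquad (12)$$

For such $\mathbf{u}$ and $\alpha_i$'s, we have $\mathrm{wt}(\mathbf{u} + \sum_{i \in B} \alpha_i \mathbf{e}_i) =
\mathrm{wt}(\mathbf{u}) + \mathrm{wt}(\sum_{i \in B} \alpha_i \mathbf{e}_i) \le t + (d-1-t) = d-1 < d$.
Moreover, as $\mathrm{supp}(\mathbf{u}) \cap B = \emptyset$ and the $\alpha_i$'s, $i \in B$, are
not all zero, we deduce that $\mathbf{u} + \sum_{i \in B} \alpha_i \mathbf{e}_i \ne \mathbf{0}$. We
conclude that $\mathbf{u} + \sum_{i \in B} \alpha_i \mathbf{e}_i \notin \mathcal{C}$.

2) We now show that the scheme is not weakly secure
against at least one adversary of strength $t = d-1$. The
more general statement can be proved in an analogous
way.

Pick a codeword $\mathbf{c} = (c_1, c_2, \ldots, c_n) \in \mathcal{C}$ such that
$\mathrm{wt}(\mathbf{c}) = d$ and let $\mathrm{supp}(\mathbf{c}) = \{i_1, i_2, \ldots, i_d\}$. Take
$\mathcal{X}_A = \{i_1, i_2, \ldots, i_{d-1}\}$, $|\mathcal{X}_A| = d-1$. Let

$$\mathbf{u} = \mathbf{c} / c_{i_d} - \mathbf{e}_{i_d} .$$

Then, $\mathbf{u} \lhd \mathcal{X}_A$ and $\mathbf{u} + \mathbf{e}_{i_d} = \mathbf{c} / c_{i_d} \in \mathcal{C}$. By Lemma
3.1, $A$ is able to determine $x_{i_d}$. Hence the scheme is not
weakly secure against the adversary $A$, who knows $d-1$
messages $x_i$ in advance.

3) Consider the following linear system of equations with
unknown $\mathbf{z} \in \mathbb{F}_q^n$

$$\begin{cases} \mathbf{z}_{\mathcal{X}_A} = \mathbf{x}_{\mathcal{X}_A} \\ G\mathbf{z}^T = \mathbf{s}^T , \end{cases}$$

which is equivalent to

$$\begin{cases} \mathbf{z}_{\mathcal{X}_A} = \mathbf{x}_{\mathcal{X}_A} \\ G_{\widehat{\mathcal{X}}_A} \mathbf{z}_{\widehat{\mathcal{X}}_A}^T = \mathbf{s}^T - G_{\mathcal{X}_A} \mathbf{x}_{\mathcal{X}_A}^T . \end{cases} \qquad (13)$$

The adversary $A$ attempts to solve this system. Given
that $\mathbf{s}$ and $\mathbf{x}_{\mathcal{X}_A}$ are known, the system (13) has $n - t$
unknowns and $k$ equations. Note that $t \le d-1$, and thus
by Theorem 2.1 we have $n - t \ge n - d + 1 \ge k$. If
$\mathrm{rank}(G_{\widehat{\mathcal{X}}_A}) = k$ then (13) has exactly $q^{n-t-k}$ solutions,
as required.

Next, we show that $\mathrm{rank}(G_{\widehat{\mathcal{X}}_A}) = k$. Assume, by
contradiction, that the $k$ rows of $G_{\widehat{\mathcal{X}}_A}$, denoted by $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_k$,
are linearly dependent. Then there exist $\beta_i \in \mathbb{F}_q$, $i \in [k]$,
not all zero, such that $\sum_{i=1}^{k} \beta_i \mathbf{r}_i = \mathbf{0}$. Let

$$\mathbf{c} = \sum_{i=1}^{k} \beta_i G[i] \in \mathcal{C} \setminus \{\mathbf{0}\} .$$

(Recall that $G[i]$ denotes the $i$-th row of $G$.) Then
$\mathbf{c}_{\widehat{\mathcal{X}}_A} = \sum_{i=1}^{k} \beta_i \mathbf{r}_i = \mathbf{0}$ and hence $\mathrm{wt}(\mathbf{c}) = \mathrm{wt}(\mathbf{c}_{\mathcal{X}_A}) \le
t \le d-1$. This is a contradiction, which follows from
the assumption that the $k$ rows of $G_{\widehat{\mathcal{X}}_A}$ are linearly
dependent.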
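The three claims of Theorem 4.4 can be checked exhaustively on a small code. The sketch below assumes an illustrative binary $[7, 4, 3]$ Hamming code and a hypothetical message vector; block security is tested through condition (6) of Lemma 4.2, and the list size in claim 3 is obtained by counting all vectors consistent with the broadcast and the side information. Indices are 0-based and all names are illustrative.

```python
from itertools import combinations, product

# Illustrative generator matrix of a binary [7, 4, 3] Hamming code.
G = [
    (1, 0, 0, 0, 0, 1, 1),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 1, 1, 0),
    (0, 0, 0, 1, 1, 1, 1),
]
k, n, d = 4, 7, 3
VECTORS = list(product((0, 1), repeat=n))
CODE = [tuple(sum(y[i] * G[i][j] for i in range(k)) % 2 for j in range(n))
        for y in product((0, 1), repeat=k)]

def broadcast(z):
    """The transmitted vector G z^T over F_2."""
    return tuple(sum(g[j] * z[j] for j in range(n)) % 2 for g in G)

def violates_condition(XA, B):
    """True if some codeword is supported on XA and B and is nonzero on B."""
    allowed = set(XA) | set(B)
    return any(all(c[j] == 0 for j in range(n) if j not in allowed)
               and any(c[j] for j in B) for c in CODE)

def block_secure(XA, b):
    """b-block security against XA, tested via condition (6)."""
    rest = [j for j in range(n) if j not in XA]
    return not any(violates_condition(XA, B) for B in combinations(rest, b))

def list_size(XA, x):
    """Number of vectors consistent with the broadcast and the side information."""
    s = broadcast(x)
    return sum(1 for z in VECTORS
               if broadcast(z) == s and all(z[j] == x[j] for j in XA))

# Claim 1: (d-1-t)-block security for every adversary of strength t <= d-2.
claim1 = all(block_secure(XA, d - 1 - t)
             for t in range(d - 1) for XA in combinations(range(n), t))
# Claim 2: weak security fails for at least one adversary of strength d-1.
claim2 = any(not block_secure(XA, 1) for XA in combinations(range(n), d - 1))
# Claim 3: the candidate list has q^(n-t-k) entries for every t <= d-1.
x = (1, 0, 1, 1, 0, 0, 1)  # hypothetical message vector
claim3 = all(list_size(XA, x) == 2 ** (n - t - k)
             for t in range(d) for XA in combinations(range(n), t))
```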



Example 4.1: Let $q = 2$. Assume that $\mathcal{X}_A = \emptyset$ and that
$\mathcal{X}_j \ne \emptyset$ for all $j \in [m]$. Consider a linear scheme for the ICSI
problem, employing an $[n, k, d]_2$-code $\mathcal{C}$ with $d = 2$, which is
defined as follows. For each $j \in [m]$ choose some $i_j \in \mathcal{X}_j$.
Let $\mathcal{C} = \mathrm{span}(\{\mathbf{e}_{i_j} + \mathbf{e}_{f(j)}\}_{j \in [m]})$. Then, indeed, $d(\mathcal{C}) = 2$.
Since $t = |\mathcal{X}_A| = 0$, we have $d - 1 - t = 1$. Therefore by
Theorem 4.4 the scheme employing $\mathcal{C}$ is weakly secure against
$A$. Moreover, if $\mathcal{C}$ is nontrivial (and so $k < n - d + 1$), we
have $k \le n - d = n - 2$.

C. Block Security and Complete Insecurity 

Theorem 4.4 provides a threshold for the security level of
a scheme that uses a given linear code $\mathcal{C}$. If $A$ has a prior
knowledge of any $t \le d-2$ messages, then the scheme is
still secure, i.e. the adversary has no information about any
$d-1-t$ particular messages from $\mathbf{x}_{\widehat{\mathcal{X}}_A}$. On the other hand,
the scheme may no longer be secure against an adversary of
strength $t = d-1$. The last assertion of Theorem 4.4 shows us
the difference between being block secure and being (strongly)
secure in a commonly used sense (see, for instance [19]). More
specifically, if the scheme is (strongly) secure, the messages
look completely random to the adversary, i.e. the probability
to guess the correct messages is $1/q^n$. However, if the scheme
is $(d-1-t)$-block secure (for $t \le d-2$), then the adversary is
able to guess the correct messages with probability $1/q^{n-t-k}$.

For an adversary of strength $t \ge d$, the security of the
scheme depends on the properties of the code employed; in
particular, it depends on the weight distribution of $\mathcal{C}$. From
Theorem 4.4, if there exists $\mathbf{c} \in \mathcal{C}$ with $\mathrm{wt}(\mathbf{c}) = w$, then
the scheme is not weakly secure against some adversary of
strength $t = w - 1$. In general, the scheme might still be ($b$-block
or weakly) secure against some adversaries of strength
$t$ for $t \ge d$. While we cannot make a general conclusion on
the security of the scheme when the adversary's strength is
larger than $d - 1$, Lemma 4.2 is still a useful tool to evaluate
the security in that situation. However, as the next theorem
shows, if the size of $\mathcal{X}_A$ is sufficiently large, then $A$ is able
to determine all the messages in $\{x_i\}_{i \in \widehat{\mathcal{X}}_A}$.

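Theorem 4.5: Suppose that the settings of the coding
scheme for the ICSI problem are defined as in Theorem 4.4.
Then the scheme is completely insecure against any adversary
of strength $t \ge n - d^\perp + 1$, where $d^\perp$ denotes the dual distance
of $\mathcal{C}$.

Proof: Suppose the adversary knows a subset $\{x_i\}_{i \in \mathcal{X}_A}$,
$\mathcal{X}_A \subseteq [n]$ and $|\mathcal{X}_A| = t \ge n - d^\perp + 1$. By Corollary 4.3
it suffices to show that for all $i \notin \mathcal{X}_A$, there exists $\mathbf{u} \in \mathbb{F}_q^n$
satisfying simultaneously $\mathbf{u} \lhd \mathcal{X}_A$ and $\mathbf{u} + \mathbf{e}_i \in \mathcal{C}$.

Indeed, take any $i \notin \mathcal{X}_A$, and let $\rho = n - t \le d^\perp - 1$.
Consider the $\rho$ indices which are not in $\mathcal{X}_A$. By Theorem 2.2
there exists a codeword $\mathbf{c} \in \mathcal{C}$ with

$$c_\ell = \begin{cases} 1 & \text{if } \ell = i , \\ 0 & \text{if } \ell \notin \mathcal{X}_A \cup \{i\} . \end{cases}$$

Then $\mathrm{supp}(\mathbf{c}) \subseteq \mathcal{X}_A \cup \{i\}$. We define $\mathbf{u} \in \mathbb{F}_q^n$ such that $\mathbf{u} \lhd
\mathcal{X}_A$, as follows. For $\ell \in \mathcal{X}_A$, we set $u_\ell = c_\ell$, and for $\ell \notin
\mathcal{X}_A$, we set $u_\ell = 0$. It is immediately clear that $\mathbf{c} = \mathbf{u} + \mathbf{e}_i$.
Therefore, by Corollary 4.3 the adversary can reconstruct $x_i$.
We have shown that the scheme is completely insecure against
an arbitrary set $\mathcal{X}_A$ satisfying $|\mathcal{X}_A| \ge n - d^\perp + 1$, hence
completing the proof. ■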
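Theorem 4.5 can also be checked by enumeration on a small code. The sketch below assumes an illustrative binary $[7, 4, 3]$ Hamming code with dual distance $d^\perp = 4$, so the threshold is $n - d^\perp + 1 = 4$. For each side-information set it looks for the codeword $\mathbf{u} + \mathbf{e}_i$ used in the proof. Indices are 0-based and the function names are hypothetical.

```python
from itertools import combinations, product

# Illustrative generator matrix of a binary [7, 4, 3] Hamming code.
G = [
    (1, 0, 0, 0, 0, 1, 1),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 1, 1, 0),
    (0, 0, 0, 1, 1, 1, 1),
]
k, n = len(G), len(G[0])
d_dual = 4
CODE = [tuple(sum(y[i] * G[i][j] for i in range(k)) % 2 for j in range(n))
        for y in product((0, 1), repeat=k)]

def recoverable(XA, i):
    """Is there a codeword u + e_i whose remaining support lies inside XA?"""
    return any(c[i] == 1
               and all(c[j] == 0 for j in range(n) if j != i and j not in XA)
               for c in CODE)

def completely_insecure(XA):
    """The adversary can recover every message outside XA."""
    return all(recoverable(XA, i) for i in range(n) if i not in XA)

threshold = n - d_dual + 1
at_or_above_threshold = all(completely_insecure(XA)
                            for t in range(threshold, n)
                            for XA in combinations(range(n), t))
# One step below the threshold, complete insecurity is no longer guaranteed.
just_below_threshold = all(completely_insecure(XA)
                           for XA in combinations(range(n), threshold - 1))
```

For this code the bound is sharp: an adversary holding the three messages indexed by the support of a weight-3 codeword cannot recover any further message.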

When $\mathcal{C}$ is an MDS code, we have $n - d^\perp + 1 = d - 1$,
and hence the two bounds established in Theorems 4.4 and 4.5
are actually tight. The following example further illustrates the
results stated in these theorems.

Example 4.2: Let $n = 7$, $m = 7$, $q = 2$. Suppose that the
receivers have in their possession the sets of messages that appear
in the third column of the table below. Suppose also that the
demands of all receivers are as in the second column of the
table.

Receiver    Demand    Side information
$R_1$       $x_1$     $\{x_6, x_7\}$
$R_2$       $x_2$     $\{x_5, x_7\}$
$R_3$       $x_3$     $\{x_5, x_6\}$
$R_4$       $x_4$     $\{x_5, x_6, x_7\}$
$R_5$       $x_5$     $\{x_1, x_2, x_6\}$
$R_6$       $x_6$     $\{x_1, x_3, x_4\}$
$R_7$       $x_7$     $\{x_2, x_3, x_6\}$

For $j = 1, 2, \ldots, 7$, let $\mathbf{v}^{(j)} \in \mathbb{F}_2^7$ such that $\mathrm{supp}(\mathbf{v}^{(j)}) =
\mathcal{X}_j$. Assume that this scheme uses the code $\mathcal{C} = \mathrm{span}(\{\mathbf{v}^{(j)} +
\mathbf{e}_j\}_{j \in [7]})$. Then the set $\{\mathbf{c}^{(j)} = \mathbf{v}^{(j)} + \mathbf{e}_j\}_{j \in [4]}$ forms a basis
for $\mathcal{C}$. It is easy to see that this $\mathcal{C}$ is a $[7, 4, 3]_2$ Hamming code
with $d = 3$ and $d^\perp = 4$.

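Suppose that $S$ broadcasts the following four bits:

$$\begin{aligned}
s_1 &= (\mathbf{v}^{(1)} + \mathbf{e}_1) \cdot \mathbf{x} = \mathbf{c}^{(1)} \cdot \mathbf{x} , \\
s_2 &= (\mathbf{v}^{(2)} + \mathbf{e}_2) \cdot \mathbf{x} = \mathbf{c}^{(2)} \cdot \mathbf{x} , \\
s_3 &= (\mathbf{v}^{(3)} + \mathbf{e}_3) \cdot \mathbf{x} = \mathbf{c}^{(3)} \cdot \mathbf{x} , \\
s_4 &= (\mathbf{v}^{(4)} + \mathbf{e}_4) \cdot \mathbf{x} = \mathbf{c}^{(4)} \cdot \mathbf{x} .
\end{aligned}$$

Each $R_j$, $j = 1, 2, \ldots, 7$, can compute $(\mathbf{v}^{(j)} + \mathbf{e}_j) \cdot \mathbf{x}$ by
using a linear combination of $s_1, s_2, s_3, s_4$. Then, each $R_j$ can
subtract $\mathbf{v}^{(j)} \cdot \mathbf{x}$ (his side information) from $(\mathbf{v}^{(j)} + \mathbf{e}_j) \cdot \mathbf{x}$ to
retrieve $x_j = \mathbf{e}_j \cdot \mathbf{x}$.

For example, consider $R_5$. Since

$$(\mathbf{v}^{(5)} + \mathbf{e}_5) \cdot \mathbf{x} = ((\mathbf{v}^{(1)} + \mathbf{e}_1) + (\mathbf{v}^{(2)} + \mathbf{e}_2)) \cdot \mathbf{x} = s_1 + s_2 ,$$

$R_5$ subtracts $x_1 + x_2 + x_6$ from $s_1 + s_2$ to obtain

$$(s_1 + s_2) - (x_1 + x_2 + x_6) = (x_1 + x_2 + x_5 + x_6) - (x_1 + x_2 + x_6) = x_5 .$$

If an adversary $A$ has a knowledge of a single message $x_i$,
then by Theorem 4.4 $A$ is not able to determine any other
message $x_\ell$, for $\ell \ne i$. Indeed, $d(\mathcal{C}) = 3$, while $t = 1$.
Therefore, the scheme is weakly secure against all adversaries
of strength $t = 1$. Similarly, if the adversary knows none of the
messages in advance, then the adversary has no information
about any group of 2 messages. On the other hand, the scheme
is completely insecure against any adversary of strength $t \ge 4$;
in that case $A$ is able to recover the remaining $n - t$ messages.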
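The delivery phase of Example 4.2 can be simulated directly. The sketch below is an illustration under two assumptions: the side-information set of $R_2$ is taken to be $\{x_5, x_7\}$, which is inferred from the relation $\mathbf{c}^{(5)} = \mathbf{c}^{(1)} + \mathbf{c}^{(2)}$ used in the example, and each receiver finds its linear combination of $s_1, \ldots, s_4$ by exhaustive search. The function names are hypothetical.

```python
from itertools import product

n = 7
# Side-information index sets X_j (1-based); X_2 is an inferred assumption.
side_info = {1: {6, 7}, 2: {5, 7}, 3: {5, 6}, 4: {5, 6, 7},
             5: {1, 2, 6}, 6: {1, 3, 4}, 7: {2, 3, 6}}

def c_vec(j):
    """c^(j) = v^(j) + e_j with supp(v^(j)) = X_j, as a 0/1 tuple."""
    return tuple(1 if (l == j or l in side_info[j]) else 0
                 for l in range(1, n + 1))

basis = [c_vec(j) for j in (1, 2, 3, 4)]

def dot(u, x):
    return sum(a * b for a, b in zip(u, x)) % 2

def combination(target):
    """Coefficients beta with sum_i beta_i c^(i) = target, or None."""
    for beta in product((0, 1), repeat=len(basis)):
        combo = tuple(sum(b * c[l] for b, c in zip(beta, basis)) % 2
                      for l in range(n))
        if combo == target:
            return beta
    return None

def deliver(x):
    """Return the message each receiver R_1..R_7 reconstructs."""
    s = [dot(c, x) for c in basis]                        # the four broadcast bits
    out = []
    for j in range(1, n + 1):
        beta = combination(c_vec(j))
        coded = sum(b * si for b, si in zip(beta, s)) % 2  # (v^(j) + e_j) . x
        known = sum(x[l - 1] for l in side_info[j]) % 2    # v^(j) . x
        out.append((coded - known) % 2)
    return tuple(out)

all_delivered = all(deliver(x) == x for x in product((0, 1), repeat=n))
```

Four broadcast bits suffice for seven receivers because $\mathbf{c}^{(5)}$, $\mathbf{c}^{(6)}$ and $\mathbf{c}^{(7)}$ all lie in the span of the first four vectors.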

D. Role of the Field Size 

The following example demonstrates that the use of codes 
over larger fields might have a positive impact on the security 
level. More specifically, in that example, codes over large 
fields significantly enhance the security, compared with the 
codes over small field. 

Example 4.3: Suppose that the source S has n messages
$x_1, x_2, \ldots, x_n$. Assume that there are $m < n$ receivers
$R_1, R_2, \ldots, R_m$, and each receiver $R_j$ has the same set of
side information, $X_j = \{m + 1, m + 2, \ldots, n\}$. Assume also
that each $R_j$ requires $x_j$, for $j \in [m]$.

We can define the ICSI scheme based on the code C, as
above. The code employed in this scheme has dimension at
least m, since all the vectors $v^{(j)} + e_j$, for some $v^{(j)} \lhd X_j$,
$j \in [m]$, are linearly independent. Therefore, the number of
transmissions required in this scheme is at least m, which is
equal to the number of transmissions in the trivial solution
(just broadcasting each of $x_1, x_2, \ldots, x_m$).

If we employ a binary code C, then for large values of n
the minimum distance d of C is bounded from above by the
sphere-packing bound

$d \leq 2n \cdot (H_2^{-1}(1 - m/n) - \varepsilon)$,

where $\varepsilon \to 0$ as $n \to \infty$. Hence the scheme, which uses a
binary code, is secure against any adversary of strength $t \leq d - 2$.
It is insecure against some adversaries of strength $t \geq d - 1$.

There is a variety of stronger upper bounds on the minimum
distance of binary codes, such as the Johnson bound, the Elias
bound, and the McEliece-Rodemich-Rumsey-Welch bound
(see [20, Chapter 4.5] for more details). These bounds impose
even tighter limits on the security of this ICSI scheme
when the scheme is based on a binary code.

By contrast, consider a q-ary code C, for $q > n + 1$
(we also assume here that all $x_i$ are in $\mathbb{F}_q$). There exists
a q-ary MDS code C of length n, dimension m, and
minimum distance $n - m + 1$ (for example, a Reed-Solomon
code). By employing this code, the new scheme is
secure against all adversaries of strength $t \leq n - m - 1$. In
order to find an appropriate generator matrix for the
Reed-Solomon code for the settings of this example, we start with
some generator matrix of the Reed-Solomon code, and then apply
Gaussian elimination to obtain a new generator matrix of the
form $G = (I_m | P)$, where P is an $m \times (n - m)$ matrix over
$\mathbb{F}_q$.
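
The construction just described can be sketched in Python for a small prime field. The values $q = 11$, $n = 7$, $m = 3$ and the evaluation points $1, \ldots, n$ are illustrative assumptions, not parameters from the example, and the function names are hypothetical. The modular inverse uses the three-argument `pow` with exponent -1, which requires Python 3.8 or later.

```python
from itertools import product

Q, N_LEN, M = 11, 7, 3  # hypothetical prime field size, length, dimension

def rs_generator(n, m, q):
    """Vandermonde generator: row i holds a^i for evaluation points a = 1..n."""
    return [[pow(a, i, q) for a in range(1, n + 1)] for i in range(m)]

def systematic(G, q):
    """Gauss-Jordan elimination modulo a prime q, giving the form (I_m | P)."""
    G = [row[:] for row in G]
    m = len(G)
    for col in range(m):
        # The first m columns of a Reed-Solomon generator are invertible,
        # so a nonzero pivot always exists in this column.
        piv = next(r for r in range(col, m) if G[r][col] % q)
        G[col], G[piv] = G[piv], G[col]
        inv = pow(G[col][col], -1, q)
        G[col] = [(inv * v) % q for v in G[col]]
        for r in range(m):
            if r != col and G[r][col]:
                f = G[r][col]
                G[r] = [(a - f * b) % q for a, b in zip(G[r], G[col])]
    return G

def min_distance(G, q):
    """Brute-force minimum weight over all nonzero codewords."""
    m, n = len(G), len(G[0])
    best = n
    for msg in product(range(q), repeat=m):
        if any(msg):
            word = [sum(a * g[c] for a, g in zip(msg, G)) % q for c in range(n)]
            best = min(best, sum(1 for w in word if w))
    return best

G_SYS = systematic(rs_generator(N_LEN, M, Q), Q)
print(min_distance(G_SYS, Q))  # 5, which equals n - m + 1
```

Row j of `G_SYS` has the form $e_j + v^{(j)}$ with the support of $v^{(j)}$ inside $\{m + 1, \ldots, n\}$, which is what the scheme of this example needs. With these illustrative parameters the scheme would be secure against adversaries of strength $t \leq n - m - 1 = 3$.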

It is well known that there is a significant gap between
the Singleton bound and the sphere-packing bound (see [20,
p. 111] for details). Therefore, for some ICSI instances, coding
over large fields can provide significantly higher levels of
security than binary coding.
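
To get a feeling for the size of this gap, the two bounds can be evaluated numerically. The sketch below drops the vanishing $\varepsilon$ term, so the binary figure is only an asymptotic estimate, and the values $n = 1000$, $m = 500$ are hypothetical. The inverse of the binary entropy function is computed by bisection on $[0, 1/2]$.

```python
from math import log2

def h2(p):
    """Binary entropy function H_2(p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def h2_inv(y, tol=1e-12):
    """Inverse of H_2 on [0, 1/2], found by bisection (H_2 increases there)."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h2(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def binary_estimate(n, m):
    """Sphere-packing estimate 2n * H_2^{-1}(1 - m/n), epsilon term dropped."""
    return 2 * n * h2_inv(1 - m / n)

def singleton(n, m):
    """Minimum distance n - m + 1 of an MDS code of length n, dimension m."""
    return n - m + 1

n, m = 1000, 500  # hypothetical instance
print(round(binary_estimate(n, m)), singleton(n, m))
```

For this hypothetical instance the binary estimate is roughly 220, while an MDS code over a large enough field reaches minimum distance 501, so the tolerated adversary strength $t \leq d - 2$ more than doubles.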

V. Conclusion and Open Questions 

In this paper, we analyze the levels of security of linear 
solutions for the ICSI problem and establish two new bounds. 
These bounds employ the minimum distance and the dual 
distance of the linear code used in the scheme. While the 
dimension of this code corresponds to the number of trans- 
missions in the scheme, the minimum distance is related to 
the security of the scheme. The generating matrix of the code 
depends on the sets of messages that each receiver owns. 

However, there are various generating matrices that can be
used for the same instance of the ICSI problem. Moreover,
puncturing some nonzero entries in the generating matrix could
probably lead to a code with a better minimum distance, which
in turn corresponds to an ICSI scheme with stronger security.
Thus, the question that remains open is how to design a
code for a particular instance of the ICSI problem that has
the largest possible minimum distance. It is very likely that
finding such a code is a hard problem. For comparison, even
finding the minimum distance of a code given by its generating
matrix is known to be NP-hard [21].

The following simple generalization of the ICSI problem
is called the Network Coding with Side Information (NCSI)
problem. Consider a network with a sender S, possessing n
messages, and m receivers $R_1, R_2, \ldots, R_m$. Each $R_j$ requests
one message. Suppose that each $R_j$ has some side information,
namely $R_j$ knows some subset of these n messages. There is
also an adversary A, listening to some links in the network,
who possesses some of the messages. Given an instance of the
NCSI problem, the following questions arise:

1) Is it possible to satisfy all the requests simultaneously by
a single transmission, using linear network coding?

2) If there exists a network coding solution, how secure is it?

Some techniques presented in this paper can be extended to
provide sufficient (and sometimes necessary) conditions for the
existence of a linear solution for the NCSI problem, and to
analyze the level of security of such a solution. We omit the
details from this paper.